library(dplyr)
library(readr)
library(ggplot2)
library(tigris)
library(sf)
library(DT)
Caitlin Uang
December 18, 2025
According to the CDC, 32% of adults in the United States consumed fast food on a given day between August 2021 and August 2023, making access to fast food a relevant factor when examining health-related risks. This project analyzes the distribution of fast food restaurants across the country to answer the question: How many fast food restaurants per capita are there in each state and county?
The motivation for this analysis is rooted in the overarching question: Are food deserts in the US linked to health problems like diabetes? For further information about this project and to see other analytical questions tied to this investigation, click here.
Using the NaNDA: Eating and Drinking Places by Census Tract & ZCTA, 1990–2021 dataset together with the U.S. Census Bureau American Community Survey Table B01003 (2019), the team analyzes the availability of fast food restaurants across the United States and identifies which states and counties have the highest density of fast food restaurants, in support of the team’s overarching question.
To obtain fast food counts for each state and county, this project uses the NaNDA: Eating and Drinking Places by Census Tract & ZCTA, 1990–2021 dataset at the census tract level, filtered to 2019. NaNDA, the National Neighborhood Data Archive, provides counts and densities of food establishments (restaurants, bars, fast food) from 1990 to 2021, sourced from the National Establishment Time Series (NETS) database.
# Cache shapefiles locally (faster repeat runs)
options(tigris_use_cache = TRUE)
# Load dataset and use only 2019 fast food count
eatdrink_tract <- read_csv("nanda_eatdrink_Tract20_1990-2021_01P.csv")
fastfood_raw <- eatdrink_tract |>
filter(year == 2019) |>
arrange(desc(count_fastfood)) |>
select(tract_fips20, totpop, count_fastfood)
The U.S. Census Bureau’s American Community Survey (ACS) Table B01003 provides estimates of total population for a given geographic area and serves as the foundational demographic measure in this analysis. The table used here is based on the 2019 five-year estimates, which are designed to provide more reliable population figures, particularly for smaller geographic units such as census tracts and counties.
The NaNDA dataset identifies locations using census tract-level FIPS codes, the standard numeric identifiers assigned to each census tract in the U.S., while ACS B01003 uses GEOIDs, which are longer codes that uniquely identify all administrative and statistical geographic areas. Each GEOID contains the tract FIPS code, which is extracted and used to join the two datasets. This provides a reliable way to align fast food counts with population figures at the census tract level without losing geographic accuracy.
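As a small illustration of the relationship described above, the sketch below pulls the 11-digit tract FIPS code out of two example GEO_ID strings written in the ACS format (a `1400000US` prefix followed by the tract code). The vector name `geo_id` and the two sample values are chosen for illustration and are not read from the project data.

```r
# Example ACS-style GEO_ID strings (illustrative, not read from the project files)
geo_id <- c("1400000US01001020100", "1400000US06037101110")

# Keep the last 11 characters: state (2) + county (3) + tract (6)
n <- nchar(geo_id)
tract_fips <- substr(geo_id, n - 10, n)

# The tract FIPS is nested: its prefixes identify the state and the county
fips_parts <- data.frame(
  geo_id      = geo_id,
  tract_fips  = tract_fips,
  state_fips  = substr(tract_fips, 1, 2),
  county_fips = substr(tract_fips, 1, 5)
)
fips_parts
```

Because the codes are nested, the same `substr()` approach can later roll tracts up to counties or states without any additional lookup table.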
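# pop_2019 is assumed to hold the ACS B01003 (2019 five-year) tract table, loaded beforehand
# Get the 11-digit tract FIPS code from the last 11 characters of GEO_ID
pop_clean <- pop_2019 |>
mutate(tract_fips20 = substr(GEO_ID, nchar(GEO_ID) - 10, nchar(GEO_ID)))
# Join ACS and fast food datasets by FIPS code and filter out tracts with 0 population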
fastfood_tract <- fastfood_raw |>
left_join(pop_clean, by = "tract_fips20") |>
select(tract_fips20,GEO_ID,totpop,B01003_001E,count_fastfood) |>
rename(acs_pop = B01003_001E) |>
mutate(acs_pop = as.numeric(acs_pop)) |>
filter(acs_pop > 0)
Now that both datasets are combined, let’s look at the joined dataset.
fastfood_tract |>
select(
FIPS = tract_fips20,
GEOID = GEO_ID,
`Total Population` = acs_pop,
`Fast Food Count` = count_fastfood
) |>
datatable(
caption = "Fast Food Restaurants and Population by Census Tract",
rownames = FALSE,
options = list(
pageLength = 10,
scrollX = TRUE
)
) |>
formatRound("Total Population", 0, mark = ",") |>
formatRound("Fast Food Count", 0)
The distribution of fast food restaurants across census tracts is highly right-skewed, as a large number of census tracts have no fast food restaurants. As the number of restaurants increases, the number of census tracts decreases rapidly.
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 0.000 0.000 0.815 1.000 22.000
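A summary like the one above can be produced with base R's `summary()`. The sketch below uses a small made-up vector of tract counts (not the project data) to show the same call, along with two quick checks that are often used to describe this kind of zero-heavy, right-skewed distribution.

```r
# Hypothetical tract-level fast food counts (illustrative values only)
count_fastfood <- c(0, 0, 0, 0, 0, 1, 0, 2, 1, 6)

# Minimum, quartiles, median, mean, and maximum
summary(count_fastfood)

# Share of tracts with no fast food restaurants
share_zero <- mean(count_fastfood == 0)

# A mean above the median is one simple indicator of right skew
is_right_skewed <- mean(count_fastfood) > median(count_fastfood)

share_zero
is_right_skewed
```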
fastfood_tract |>
ggplot(aes(x = count_fastfood)) +
geom_histogram(
bins = 50,
fill = "#9badff",
color = "white",
linewidth = 0.3
) +
labs(
title = "Distribution of Fast Food Count by Census Tract",
caption = "Source: NaNDA Eating and Drinking Places by Census Tract & ZCTA, 1990–2021\nU.S. Census Bureau ACS B01003 (2019)",
x = "Fast Food Count",
y = "# of Census Tracts"
) +
theme(
plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
plot.caption = element_text(size = 10, face = "italic", hjust = 0)
)
The distribution of population across census tracts is relatively contained and mildly right-skewed, with most tracts falling between 2,000 and 6,000 people and very few extreme outliers. The median is 3,852, representing the population of a typical census tract.
Min. 1st Qu. Median Mean 3rd Qu. Max.
2 2792 3852 3981 5008 38754
fastfood_tract |>
ggplot(aes(x = acs_pop)) +
geom_histogram(
bins = 50,
fill = "#9badff",
color = "white",
linewidth = 0.3
) +
geom_vline(
xintercept = median(fastfood_tract$acs_pop, na.rm = TRUE),
linetype = "dashed",
linewidth = 0.8,
color = "#0432FF"
) +
labs(
title = "Distribution of Population by Census Tract",
caption = "Source: NaNDA Eating and Drinking Places by Census Tract & ZCTA, 1990–2021\nU.S. Census Bureau ACS B01003 (2019)",
x = "Population",
y = "# of Census Tracts"
) +
theme(
plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
plot.caption = element_text(size = 10, face = "italic", hjust = 0)
)
Now that we have a better understanding of how fast food counts and population are distributed across census tracts, the team shifted focus to fast food density, measured as the number of fast food restaurants per 1,000 people, to allow a relative comparison of fast food access across census tracts. Early in the project, the team found that looking only at the count of fast food restaurants is misleading because census tracts vary in the number of people living in them and fast food restaurants are unevenly distributed. For example, states with larger total populations tend to have more fast food restaurants. Using fast food density allows for a more meaningful comparison because it accounts for population size.
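To make the point about raw counts concrete, here is a minimal sketch with two hypothetical areas. The helper `fastfood_density()` and all numbers are invented for illustration; they only show how the ranking can flip once population is taken into account.

```r
# Fast food restaurants per 1,000 residents (hypothetical helper)
fastfood_density <- function(count, population) {
  (count / population) * 1000
}

# Two made-up areas: one large, one small
areas <- data.frame(
  area           = c("Large area", "Small area"),
  count_fastfood = c(400, 30),
  population     = c(2000000, 100000)
)

areas$density <- fastfood_density(areas$count_fastfood, areas$population)
areas
```

In this toy case the large area has far more restaurants (400 versus 30), yet the small area has the higher density (0.3 versus 0.2 per 1,000 people).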
In the next section, we will look at fast food density at the county and state level to examine any patterns that can help us answer the overarching question.
To examine fast food density at the national scale, we first aggregated fast food restaurant counts from the census tract level to the state level. Looking at the data at the state level reduces local variability caused by census tracts with small populations and provides a clearer view of regional differences in fast food restaurant availability. The state FIPS code is extracted from each census tract code, and fast food restaurant counts and population are summed by state; fast food density is then calculated by dividing the restaurant count by the population and multiplying by 1,000. Next, we joined the aggregated fast food density to a 2019 U.S. state boundary shapefile to create a choropleth map.
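The aggregation order matters: counts and population are summed first, and the ratio is taken afterwards. The sketch below walks through that step on three made-up tracts using base R's `aggregate()` rather than the dplyr pipeline used in this report, and compares the result with a simple average of tract-level densities. The tract codes and values are invented for illustration.

```r
# Hypothetical tracts in two states (made-up codes and values)
tracts <- data.frame(
  tract_fips20   = c("01001020100", "01001020200", "56001962700"),
  count_fastfood = c(2, 0, 1),
  acs_pop        = c(4000, 1000, 5000)
)

# State FIPS = first two characters of the tract FIPS
tracts$state_fips20 <- substr(tracts$tract_fips20, 1, 2)

# Sum counts and population within each state, then divide
state <- aggregate(cbind(count_fastfood, acs_pop) ~ state_fips20,
                   data = tracts, FUN = sum)
state$den_fastfood <- state$count_fastfood / state$acs_pop * 1000
state

# For comparison: averaging tract-level densities gives a different answer
tract_den <- tracts$count_fastfood / tracts$acs_pop * 1000
mean_tract_den <- tapply(tract_den, tracts$state_fips20, mean)
mean_tract_den
```

For state `01`, the pooled density is 0.4 per 1,000 people, while the unweighted mean of its two tract densities is 0.25, because the small tract with no restaurants counts as much as the larger one in the average.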
# Aggregate tract to state
fastfood_state <- fastfood_tract |>
mutate(state_fips20 = substr(tract_fips20, 1, 2)) |>
group_by(state_fips20) |>
summarize(
total_fastfood = sum(count_fastfood, na.rm = TRUE),
total_pop = sum(acs_pop, na.rm = TRUE),
den_fastfood = (total_fastfood / total_pop) * 1000,
.groups = "drop"
) |>
filter(total_pop > 0)
# Load state shapefile and join
state_shapes <- states(year = 2019, cb = TRUE) |>
mutate(STATEFP = as.character(STATEFP))
fastfood_map_state <- state_shapes |>
left_join(fastfood_state, by = c("STATEFP" = "state_fips20"))
This choropleth shows a clear regional pattern: the South appears in darker shades, indicating higher fast food density than the Northeast, Midwest, and West, with a notable exception in the West. Let’s now look at a table of state-level fast food density values with the corresponding fast food counts and total population.
# Create heat map
ggplot(fastfood_map_state) +
geom_sf(aes(fill = den_fastfood), color = "white") +
scale_fill_gradient(
low = "#ffffff",
high = "#932092",
name = "Fast Food\nDensity", limits = c(NA,NA)
)+
coord_sf(
xlim = c(-125, -66),
ylim = c(24, 50),
expand = FALSE
) +
labs(
title = "Fast Food Density by State",
caption = "Source: NaNDA Eating and Drinking Places by Census Tract & ZCTA, 1990–2021\nU.S. Census Bureau ACS B01003 (2019)"
) +
theme_void() +
theme(
legend.position = "right",
plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
plot.margin = margin(5, 5, 5, 5),
plot.caption = element_text(size = 10, face = "italic", hjust = 0)
)
These results show a clear trend: Southern states dominate the top rankings, accounting for 8 of the 10 states with the highest fast food density. Mississippi ranks highest with 0.30 fast food restaurants per 1,000 people. Wyoming stands out as an outlier outside the South, ranking third with a fast food density of 0.28.
This pattern suggests that the South and select western regions may face challenges in accessing healthy food and are saturated with convenient fast food options.
# Create datatable
datatable(
fastfood_map_state |>
st_drop_geometry() |>
arrange(desc(den_fastfood)) |>
slice_max(den_fastfood, n = 10) |>
select(
State = NAME,
`Fast Food Density` = den_fastfood,
`Total Population` = total_pop,
`Fast Food Count` = total_fastfood),
rownames = FALSE,
caption = "Top 10 States with the Highest Fast Food Density",
options = list(
searching = FALSE,
scrollX = FALSE,
autoWidth = TRUE
)
) |>
formatRound("Fast Food Density", 2) |>
formatRound("Total Population", 0, mark = ",") |>
formatRound("Fast Food Count", 0)
While the state-level analysis provides a useful high-level overview of fast food restaurant availability across the United States, it can hide substantial variation within states. Larger and more diverse states may have both urban counties with relatively low fast food density and rural counties with disproportionately high fast food density once population size is accounted for. To capture this local heterogeneity and identify counties with higher fast food density, let’s shift from the state level to the county level. Examining the joined dataset by county gives a more granular picture of how fast food restaurants are distributed and helps reveal patterns that may be more directly relevant to county-level health problems like diabetes and obesity, and to access to quality food, which ties back to the overarching question.
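The following toy example sketches how a single state-level figure can mask county-level differences. The three counties, their codes, and their numbers are hypothetical: one populous county with low density and two small counties with much higher density.

```r
# Hypothetical counties within one state (made-up codes and values)
counties_toy <- data.frame(
  county_fips20  = c("48001", "48003", "48005"),
  total_fastfood = c(300, 4, 6),
  total_pop      = c(2000000, 8000, 10000)
)

# County-level density per 1,000 people
counties_toy$den_fastfood <-
  counties_toy$total_fastfood / counties_toy$total_pop * 1000

# State-level density pools all counties before dividing
state_den <- sum(counties_toy$total_fastfood) / sum(counties_toy$total_pop) * 1000

counties_toy
state_den
range(counties_toy$den_fastfood)
```

Here the pooled state density (about 0.15) sits close to the populous county's value, while the two small counties are at 0.5 and 0.6, a spread that only becomes visible at the county level.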
Similar to the state level analysis, let’s take a look at the choropleth map at the county level to see any patterns.
# State lookup table
state_lookup <- states(year = 2019) |>
st_drop_geometry() |>
select(STATEFP, STUSPS, NAME) |>
rename(
State_Abbr = STUSPS,
State_Name = NAME
)
# Aggregate tract to county
den_fastfood_county <- fastfood_tract |>
mutate(county_fips20 = substr(tract_fips20, 1, 5)) |>
group_by(county_fips20) |>
summarize(
total_fastfood = sum(count_fastfood, na.rm = TRUE),
total_pop = sum(acs_pop, na.rm = TRUE),
den_fastfood = (total_fastfood / total_pop) * 1000,
.groups = "drop"
) |>
filter(total_pop > 0)
# Load shapefile and join
county_shapes <- counties(year = 2019, cb = TRUE) |>
mutate(GEOID = as.character(GEOID))
fastfood_map_county <- county_shapes |>
left_join(den_fastfood_county, by = c("GEOID" = "county_fips20")) |>
left_join(state_lookup, by = "STATEFP") |>
mutate(
den_fastfood_plot = ifelse(den_fastfood == 0, NA, den_fastfood)
)
ggplot(fastfood_map_county) +
geom_sf(aes(fill = den_fastfood), color = NA, linewidth = 0.1) +
scale_fill_gradientn(
colours = c("grey90", "#e792e7", "#5b7cff", "#0432FF"),
name = "Fast Food\nDensity",
limits = c(NA, NA),
na.value = "#ffffff",
trans = "sqrt"
) +
coord_sf(
xlim = c(-125, -66),
ylim = c(24, 50),
expand = FALSE
) +
labs(
title = "Fast Food Density by County",
subtitle = "County, 2019",
caption = "Source: NaNDA Eating and Drinking Places by Census Tract & ZCTA, 1990–2021\nU.S. Census Bureau ACS B01003 (2019)"
) +
theme_void() +
theme(
legend.position = "right",
plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
plot.margin = margin(5, 5, 5, 5),
plot.caption = element_text(size = 10, face = "italic", hjust = 0)
)
fastfood_map_county |>
st_drop_geometry() |>
ggplot(aes(x = den_fastfood)) +
geom_histogram(
bins = 30,
fill = "#9badff",
color = "white",
linewidth = 0.3
) +
coord_cartesian(xlim = c(0, 1)) +
labs(
title = "Fast Food Restaurants per 1,000 People",
subtitle = "Distribution of Fast Food Density per County, 2019",
caption = "Source: NaNDA Eating and Drinking Places by Census Tract & ZCTA, 1990–2021\nU.S. Census Bureau ACS B01003 (2019)",
x = "Fast Food Density",
y = "# of Counties"
) +
theme(
axis.line = element_line(color = "black", linewidth = 0.4),
plot.title = element_text(size = 18, face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
plot.margin = margin(5, 5, 5, 5),
plot.caption = element_text(size = 10, face = "italic", hjust = 0),
panel.background = element_rect(fill = "white")
)
# Create datatable
datatable(
fastfood_map_county |>
st_drop_geometry() |>
arrange(desc(den_fastfood)) |>
slice_max(den_fastfood, n = 10) |>
select(
County = NAME,
State = State_Name,
`Fast Food Count` = total_fastfood,
`Total Population` = total_pop,
`Fast Food Density` = den_fastfood
),
caption = "Fast Food Restaurants per 1,000 People by County (2019)",
rownames = FALSE,
options = list(
pageLength = 10,
scrollX = TRUE
)
) |>
formatRound("Fast Food Count", 0, mark = ",") |>
formatRound("Total Population", 0, mark = ",") |>
formatRound("Fast Food Density", 2)